A PARADIGM FOR EFFICIENT SUBSET RECOGNITION


1  Traditional  Automata  and  Sets 

Consider the number of symbols in a regular expression (excluding operators and
parentheses) that would be required to specify a traditional deterministic finite state
automaton (DFA) capable of determining whether n input symbols form a set equal to a
given set S of n elements. Since a DFA recognizes only strings, all n! permutations of the
n elements comprising S must be explicitly specified in the expression. Thus, a regular
expression in simple sum-of-products form would require n · n! symbols. However, closer
inspection reveals that sum-of-products form does not, in general, result in a minimal
expression. Since there are only n distinct symbols, the n! products may be partitioned
into n groups according to their first symbol and then factored. In other words, each of
the n sums of n-symbol products with identical first symbols can be converted into a
product of the factored symbol and the sum of the remaining (n - 1)-symbol products. This
process may then be applied recursively on the new sums-of-products and on their
resulting sums-of-products until no products within a particular partition share the same
first symbol.


The number of symbols resulting after the application of this process can be described by
the difference equation f(n) = n · f(n - 1) + n. This equation states that the number
of symbols required to specify a machine which will recognize a set of n elements is n
times the number of symbols required to specify a machine capable of recognizing a set of
(n - 1) elements plus the n symbols which remain after factoring the common first symbols.
(By definition, f(1) = 1 since a single symbol has no permutations other than itself.) In
more numerically enlightening terms, this equation reveals that the number of symbols in
a regular expression required to specify a machine capable of recognizing a set S is equal
to the sum of the number of r-permutations (r = 1 to |S|) over the elements of S:

$$f(n) = \sum_{r=1}^{n} \frac{n!}{(n-r)!}, \qquad n = |S|$$

From this expression it can be shown that n! < f(n) < n! · e for n > 1, and thus f(n) is O(n!).
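The recurrence and the permutation sum can be compared numerically. The following is a minimal sketch in Python; the names f_rec and f_sum are illustrative and do not appear in the report.

```python
from math import e, factorial, perm

def f_rec(n: int) -> int:
    """Symbol count from the recurrence f(n) = n*f(n-1) + n, with f(1) = 1."""
    return 1 if n == 1 else n * f_rec(n - 1) + n

def f_sum(n: int) -> int:
    """Sum of the r-permutation counts of n elements for r = 1..n."""
    return sum(perm(n, r) for r in range(1, n + 1))

# The two forms agree, and the bound n! < f(n) < n!*e holds for n > 1.
for n in range(2, 9):
    assert f_rec(n) == f_sum(n)
    assert factorial(n) < f_rec(n) < factorial(n) * e
```

For n = 3, for instance, both forms give 15 symbols against 3 · 3! = 18 for the unfactored sum-of-products form.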


Manuscript approved January 19, 1990.
It should be clear from this result that the use of regular expressions to specify sets is
impractical. However, a new form of expression shall be defined which is capable of
efficiently specifying large, complex set definitions. This notation, which will be referred
to as set-expressions, will prove to be a powerful tool for expressing a wide variety of set
recognition problems.

2  Set-Expressions 

Let U be a finite universal set, or domain. The sets denoted by set-expressions over U are
defined  recursively  as  follows: 

1. If a ∈ U, then a is a set-expression which denotes the set {{a}}.

2. If E1 and E2 are set-expressions which denote sets S1 and S2, respectively, then
(E1 + E2) is a set-expression which denotes the set S1 ∪ S2.

3. If E1 and E2 are set-expressions which denote sets S1 and S2, respectively, then
(E1 * E2), optionally written E1E2, denotes the set {t1 ∪ t2 | t1 ∈ S1 and t2 ∈ S2}.

4. If E1 and E2 are set-expressions which denote sets S1 and S2, respectively, then
(E1 - E2) denotes the set S1 - S2 (where '-' in this case represents ordinary set difference).

5. If E1 and E2 are set-expressions which denote sets S1 and S2, respectively, then (E1 / E2)
denotes the set {t1 - (t1 ∩ t2) | t1 ∈ S1 and t2 ∈ S2}.

(Note 1: The '*' and '+' are read as AND and OR, respectively. Note 2: Many parentheses
may be avoided without confusion if the AND operator is assumed to take precedence over
OR. Note 3: The '/' and '*' operators and the '-' and '+' operators are not in general
inverses; e.g., (A * B)/B = A if and only if every elemental set in A is disjoint with every
elemental set in B. Similarly, (A + B) - B = A if and only if A and B are disjoint. The '/'
and '-' operators are discussed briefly in the appendix.)


For example, the set-expression ab denotes the set {{a, b}}. The expression a+b, on the other
hand, specifies the set {{a}, {b}}. The more complicated expression (a + b)(c + d) denotes
the set {{a,c}, {a,d}, {b,c}, {b,d}}. From the definition and examples, one should notice
that although set-expressions operate on sets of sets, they are very similar to ordinary set
notation both in terms of intuition and flexibility. In order to further parallel this similarity,
the universal set U will be regarded as a set-expression denoting a set of single-element sets.
This will permit expressions such as U - a_k without any confusion as to what is denoted.
An exponential operator can also be assumed which specifies repeated applications of the
AND operation. For example, the expression (a_1 + a_2 + ... + a_n)^k, or just U^k, specifies the
set of subsets over U, |U| = n, of k or fewer elements. Similarly, one can concisely define
sets such as the set of all subsets over a given U minus an element a_k, or the set of all
subsets over a given U all of which contain an element a_k.
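The semantics of the atom, OR, AND, and exponent constructs can be illustrated directly on sets of sets. The following is a minimal sketch in Python, assuming each denoted set is held as a Python set of frozensets; the function names atom, OR, AND, and power are illustrative and not part of the report.

```python
from itertools import product

def atom(a):
    """A single element a denotes the set {{a}}."""
    return {frozenset([a])}

def OR(s1, s2):
    """(E1 + E2): the union of the two collections."""
    return s1 | s2

def AND(s1, s2):
    """(E1 * E2): every pairwise union t1 | t2."""
    return {t1 | t2 for t1, t2 in product(s1, s2)}

def power(s, k):
    """E^k: the AND operation applied k - 1 times to E."""
    result = s
    for _ in range(k - 1):
        result = AND(result, s)
    return result

a, b, c, d = (atom(x) for x in "abcd")
ab = AND(a, b)                     # {{a, b}}
a_or_b = OR(a, b)                  # {{a}, {b}}
pairs = AND(OR(a, b), OR(c, d))    # {{a,c}, {a,d}, {b,c}, {b,d}}
U = a | b | c | d                  # U as a set of single-element sets
up_to_two = power(U, 2)            # non-empty subsets of at most 2 elements
```

With |U| = 4, up_to_two holds the 4 single-element sets and the 6 two-element sets, since the union of an element with itself is the element alone.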


3  The  Set-Recognizing  Automaton 

An SRA (Set Recognizing Automaton), like automata for strings, consists of a finite set of
states and a set of transitions which determines the subsequent state of the machine from
the current state of the machine and an input symbol. It has an initial state q0 and a set
of final states F. Unlike automata for strings, SRA's are defined over a set of elements
U, rather than an alphabet, which represents the universal set, or domain, of its inputs.
This universal set differs from an alphabet in that it has an associated linear ordering θ.
(The choice of an ordering function θ for any particular SRA is arbitrary simply because
the ordering of the members of a set, unlike the ordering of symbols comprising a string,
is completely arbitrary. For practical purposes, though, if the elements of U are viewed as
strings then it is convenient to let θ represent usual lexicographic order.) All input to an
SRA is assumed to be θ-ordered.

Also like automata for strings, an SRA has an associated transition diagram. This diagram
consists of a finite, directed graph in which each state is represented as a vertex and each
transition from a state p to a state q on an element e_i is represented as a directed arc
from the vertex p to the vertex q labelled e_i. A restriction will be imposed upon the set
of allowable transitions that θ of the label of any arc leaving a particular vertex must be
greater than θ of any arc entering that vertex. This restriction enforces, among other things,
the definition of a set as a collection of distinct objects. Since an SRA assumes θ-ordered
input, this restriction does not impose any practical limitations; however, its existence will
be important for subsequent mathematical analyses of SRA's.


Formally, an SRA will be denoted as a 6-tuple (Q, U, θ, δ, q0, F) where Q is a finite set of
states, U is a finite set of elements, θ is a bijection which maps the i elements in U onto the
integers 1 to i, δ is a function which maps Q × U to Q (i.e. δ(q, e) ∈ Q), q0 is an element
of Q representing the initial state of the machine, and F is a set of states from Q which
represents final (or accepting) states of the machine. An SRA M is said to recognize a set
S = {e_1, e_2, ..., e_k} (ordered according to θ) if and only if e_1 is a transition leaving q0, e_k is a
transition to a final state, and for every element e_i, i < k, there exists an arc labeled e_i to a
state with an arc labeled e_(i+1) leaving it. In other words, M = (Q, U, θ, δ, q0, F) recognizes
S if and only if there exists a path e_1, e_2, ..., e_k from q0 to some state r ∈ F. (Uppercase
S will be used to denote a particular set recognized by a machine M while reserving the
notation S(M) to refer to the collection (or set) of sets which can be recognized by M.)


From the preceding definitions, some useful observations can be made concerning the
maximum size (i.e. number of states) an SRA can have for a given U of n elements. Because
of the imposed restriction that θ(e_j) > θ(e_i) for every arc e_j leaving a state q and every
arc e_i entering q, it should be apparent that any arc e_k, θ(e_k) = k, entering q limits the
number of arcs leaving q to a maximum of n - k. Thus, there is a maximum of

$$\binom{n}{k}$$

states which can be reached after consuming k symbols. Since an SRA can consume no more than
n symbols, the binomial theorem implies there is a maximum of

$$2^{n} = \sum_{k=0}^{n} \binom{n}{k}$$

states in the entire machine. This is not surprising since as many as 2^n subsets
of any set of n elements can be specified, and one would expect there to be at least one
state per recognized set. (Actually, if final states are merely required to signal acceptance
or non-acceptance, 2n final states may be coalesced into a single state, thus limiting the
maximum to [...] + 1.) What is also important to note is that the linear ordering imposed
by θ implies that the number of transitions in the graph is linear in the number of states.


4  SRA  Construction  Algorithm 

In  order  to  demonstrate  the  practicality  of  this  theoretical  machine,  it  is  important  to 
develop  algorithms  for  implementing  it  in  a  high-level  programming  language.  The  strategy 
will  consist  of  constructing  a  routine  which  will  convert  a  given  set-expression  into  the 
underlying  graphical  structure  of  its  equivalent  SRA  and  then  creating  a  driver  to  animate 
the  graph  for  effective  set  recognition.  More  powerful  forms  of  SRA's  will  be  examined  and 
an  evaluation  of  their  capabilities  and  limitations  will  be  undertaken. 

The input to the construction routine will consist of a list U and a set-expression. (For
expressive convenience, the symbols OR and + will be used interchangeably
to denote the OR operation and the symbols AND and * will be used interchangeably,
or even omitted, to denote the AND operation.) The routine will then evaluate each
operator and construct piece by piece the final machine. Each state of this machine will
be represented as a triple (a T f), where a is an element of U which, from the context of
the machine, uniquely identifies the state; T is a list of states to which transitions can be
made; and f signifies whether the state is a final state or not. Thus, at the top level the
machine will appear as the transition list of the initial state. A list representation of triples
is the only data structure needed. For notational convenience, the kth triple in a list L of
triples will be denoted L(k). Furthermore, elements of this triple will be specified by using
the notation L(k).trans_symb to access the first element of the triple, L(k).trans_list to
access the second element, and L(k).final_state to refer to the third element. A function
next_triple() will be assumed which extracts and returns the first triple from a list of triples.

The  following  defines  a  simple  mechanism  for  evaluating  a  set  expression: 

function: EVALUATE (EXP);
local variables: prod, sum, symbol;
do forever;
   if EXP is empty then return OR(sum, prod);
   get_symb: symbol <-- NEXT_SYMBOL(EXP);
   if symbol = '*' or symbol = 'AND' then goto get_symb;
   if symbol = '+' or symbol = 'OR' then begin;
      sum <-- OR(sum, prod);
      prod <-- nil;
   end;
   else begin;
      if symbol is an expression
         then symbol <-- EVALUATE(symbol);
         else symbol <-- triple(symbol, nil, true);
      if prod is empty then prod <-- symbol;
      else prod <-- AND(symbol, prod);
   end;
end;
end;

The basic logic of this function consists of a left-to-right evaluation of the expression as
a sum-of-products where each symbol is an atom, an expression, or an operator. If the
symbol is an atom, a triple is constructed representing a final state; if it is an expression, it
is evaluated; if it is an AND operator, it is ignored since adjacent operands are assumed to
represent products; and if it is an OR operator (or the end of the expression), an OR
operation is performed. Since the literals AND, OR, *, and + are used to represent operators,
they should not be elements of U. In other words, they cannot be interpreted as operands
by the above algorithm.

From the EVALUATE function one can see that the OR operation is performed by a function
OR. This function is defined as follows:

function: OR (M1, M2);
local variables: t1, t2, machine;
do forever;
   if M1 is empty then add M2 to machine and return machine;
   if M2 is empty then add M1 to machine and return machine;
   if M1(1).trans_symb < M2(1).trans_symb
      then add next_triple(M1) to machine;
   else if M2(1).trans_symb < M1(1).trans_symb
      then add next_triple(M2) to machine;
   else begin;
      t1 <-- next_triple(M1);
      t2 <-- next_triple(M2);
      add triple(t1.trans_symb,
                 OR(t1.trans_list, t2.trans_list),
                 t1.final_state | t2.final_state)
         to machine;
   end;
end;
end;
The relatively small amount of code belies the deeper complexity of this function. Thus, a
detailed explanation is necessary to demonstrate its correctness.

Observe that the OR operation in a set-expression specifies a pair of alternative membership
requirements, one of which must be satisfied if a set is to be accepted by the machine
described. This operation can be performed by unioning the top-level transition lists by
width. "By width" is specified because it is clear that the result should have the same depth
(i.e. level of nesting) as the alternative of greatest depth, since the OR operation does not
affect the elemental sets of the sets described by its operands. In terms of machines, this
amounts to coalescing the initial states of two machines. Hence, if a trivial machine A which
consists of nothing more than an empty transition list (i.e. the machine recognizes no sets)
is ORed with a non-trivial machine B, the result will be a machine identical to B. Thus,
the first two tests in the function merely determine whether one of the machines is trivial
and, if so, return the other unchanged.

The next three tests assume lexicographic order for θ. They determine which transition
under consideration from M1 and M2 is to be placed into the union in order to maintain
ascending order by θ. The transition selected is then appended to the list machine. If the
two transitions under consideration are not distinct (i.e. represent identical states), they
are coalesced into a single transition by performing an OR of their transition lists, and a
logical-or operation ensures that the resulting state is marked as final if either
of the states coalesced to produce it is a final state. For example,¹

M1 <-- (A (B () true C () true) false);   /* M1 = AB+AC */
M2 <-- (A (D () true) false D () true);   /* M2 = AD+D */
OR(M1, M2);

returns (A (B () true C () true D () true) false D () true) which is equivalent to the
set-expression A * (B + C + D) + D, which denotes the sets AB, AC, AD, and D. An
examination of the algorithm reveals that if M1 is a set of m sets and M2 is a set of n sets,
then in the worst case OR will have to compare m + n sets; however, one should expect the
average case to require many fewer comparisons.

¹ For notational convenience one level of parentheses will be omitted (e.g., a list of triples
such as ((a b c) (d e f)) will be written as (a b c d e f), where triples are identified by
context rather than explicitly by parentheses).
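The horizontal merge performed by OR can be sketched in Python as follows. This is an illustration under stated assumptions rather than the report's implementation: a machine is assumed to be a Python list of (symbol, transitions, final) tuples sorted by symbol, and the name or_machines is hypothetical.

```python
def or_machines(m1, m2):
    """Merge two sorted transition lists, coalescing triples with equal symbols."""
    merged = []
    i = j = 0
    while i < len(m1) and j < len(m2):
        t1, t2 = m1[i], m2[j]
        if t1[0] < t2[0]:
            merged.append(t1)
            i += 1
        elif t2[0] < t1[0]:
            merged.append(t2)
            j += 1
        else:
            # Same symbol: OR the transition lists and the final markers.
            merged.append((t1[0], or_machines(t1[1], t2[1]), t1[2] or t2[2]))
            i += 1
            j += 1
    # One list is exhausted; the remainder of the other is kept unchanged.
    return merged + m1[i:] + m2[j:]

m1 = [("A", [("B", [], True), ("C", [], True)], False)]   # AB + AC
m2 = [("A", [("D", [], True)], False), ("D", [], True)]   # AD + D
result = or_machines(m1, m2)
```

Here result corresponds to the machine (A (B () true C () true D () true) false D () true) given in the example above.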

From EVALUATE it can be seen that the AND operation is performed by the function
AND. This function is similar in logic to the OR function except that the unioning takes
place vertically rather than horizontally. As a result, the process is applied recursively in
the case of AND rather than iteratively as in OR. However, because each pair of triples
from M1 × M2 must be considered, the process must be repeated for every triple in M1
and M2. The definition is as follows:

function: AND (M1, M2);
local variables: t1, t2, machine, temp, fstate;
do forever;
   if M1 is empty or M2 is empty then return machine;
   if M1(1).trans_symb < M2(1).trans_symb then begin;
      t1 <-- next_triple(M1);
      t1.trans_list <-- AND(t1.trans_list, M2);
      if t1.final_state is true then
         t1.trans_list <-- OR(t1.trans_list, M2);
      add t1 to machine;
   end;
   else if M2(1).trans_symb < M1(1).trans_symb then begin;
      t2 <-- next_triple(M2);
      t2.trans_list <-- AND(t2.trans_list, M1);
      if t2.final_state is true then
         t2.trans_list <-- OR(t2.trans_list, M1);
      add t2 to machine;
   end;
   else begin;
      t1 <-- next_triple(M1);
      t2 <-- next_triple(M2);
      fstate <-- false;
      if t2.trans_list is empty then
         fstate <-- t1.final_state;
      if t1.trans_list is empty then
         fstate <-- fstate | t2.final_state;
      temp <-- AND(t1.trans_list, t2.trans_list);
      if t1.final_state is true then
         temp <-- OR(temp, M2);
      if t2.final_state is true then
         temp <-- OR(temp, M1);
      add triple(t1.trans_symb, temp, fstate) to machine;
   end;
end;
end;

A close inspection of this code reveals why each state is initially marked as final in
EVALUATE: the process of performing a vertical union of two sets results in a filtering of final
state demarcations to the transition element of greatest value. In other words, AND ensures
that a final state which signifies the acceptance of a particular set cannot be reached until
all of the θ-ordered elements of that set have been consumed. Thus, states are tentatively
marked as final until subsequent processing of the set-expression leads to an AND operation
which explicitly specifies that it should be marked otherwise. For example,
M1 <-- (A (B () true C () true) false);   /* M1 = AB+AC */
M2 <-- (A (D () true) false D () true);   /* M2 = AD+D */
AND(M1, M2);

returns (A (B (D () true) false C (D () true) false) false) which is equivalent to the
set-expression A * ((B + C) * D), which denotes the sets ABD and ACD. An examination of
the algorithm reveals that if M1 is a set of m sets and M2 is a set of n sets, then (M1 * M2)
will cause AND to merge at most mn pairs of sets.
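The result of an AND can be cross-checked against the definition of the operation. The sketch below is a brute-force reference in Python, not the recursive AND routine of the report: it enumerates the sets accepted by each machine, forms every pairwise union, and rebuilds a sorted triple list. The tuple representation and the names sets_of, build, and and_reference are assumptions made for illustration, and the enumeration is only practical for small machines.

```python
def sets_of(machine, prefix=()):
    """Enumerate the accepted sets, each as a tuple in ascending order."""
    found = set()
    for symbol, transitions, final in machine:
        path = prefix + (symbol,)
        if final:
            found.add(path)
        found |= sets_of(transitions, path)
    return found

def build(sets):
    """Build the sorted triple list accepting exactly the given ordered sets."""
    machine = []
    for symbol in sorted({s[0] for s in sets if s}):
        tails = {s[1:] for s in sets if s and s[0] == symbol}
        # An empty tail means some set ends here, so the state is final.
        machine.append((symbol, build(tails), () in tails))
    return machine

def and_reference(m1, m2):
    """AND by definition: one accepted set t1 | t2 per pair (t1, t2)."""
    unions = {tuple(sorted(set(t1) | set(t2)))
              for t1 in sets_of(m1) for t2 in sets_of(m2)}
    return build(unions)

m1 = [("A", [("B", [], True), ("C", [], True)], False)]   # AB + AC
m2 = [("A", [("D", [], True)], False), ("D", [], True)]   # AD + D
result = and_reference(m1, m2)
```

For these operands the four pairwise unions collapse to the two sets ABD and ACD, and result has the same shape as the machine shown in the example above.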

Given that a deterministic graphical representation of an SRA can be generated, the
construction of a driver is straightforward. The following function accepts an input list S
(representing a θ-ordered set over U) and an SRA M and returns a logical constant
signifying whether the set is recognized by the given machine:

function: DRIVER (S, M);
local variables: symbol, triple;
if S is empty then return false;
symbol <-- next_symb(S);
do while M is not empty;
   triple <-- next_triple(M);
   if triple.trans_symb = symbol then begin;
      if S is empty then return triple.final_state;
      symbol <-- next_symb(S);
      M <-- triple.trans_list;
   end;
end;
return false;
end;
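The driver can be sketched in Python along the same lines. This is an illustration only, assuming a machine is a list of (symbol, transitions, final) tuples and that the input is already θ-ordered; the name recognizes is hypothetical.

```python
def recognizes(ordered_input, machine):
    """Return True if the ordered input spells a path that ends in a final state."""
    if not ordered_input:
        return False
    level, final = machine, False
    for symbol in ordered_input:
        for sym, transitions, is_final in level:
            if sym == symbol:
                level, final = transitions, is_final
                break
        else:
            return False  # no transition on this symbol from the current state
    return final

# The machine for A * (B + C + D) + D.
machine = [
    ("A", [("B", [], True), ("C", [], True), ("D", [], True)], False),
    ("D", [], True),
]
```

With this machine the input (A, C) is accepted, while (A) alone is rejected because the state reached on A is not final.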
5 Enhanced Construction Algorithm


The machine resulting from the construction just described effectively recognizes the sets
specified by its corresponding set-expression definition; however, for most practical set
recognition applications, an SRA needs to reveal not only that some set has been
recognized, but specifically what set has been recognized. In other words, there is a need to
associate relevant information (e.g. what has been recognized, actions to be taken, etc.)
about a set with the final state that accepts it. This can be accomplished through the
following enhancements to the previous construction.

The following procedure accepts a list of pairs, each consisting of a set-expression and a list
(or atom) of information to be associated with the final state signalling its acceptance:

function: DEFINITIONS (EXPRESSIONS);
local variables: expression, machine, fs_info;
do while EXPRESSIONS is not empty;
   expression <-- next_expression(EXPRESSIONS);
   fs_info <-- expression.final_state_information;
   machine <-- OR(machine, EVALUATE(expression.exp, fs_info));
end;
return machine;
end;

The following is an example of a definition which specifies a machine which recognizes the
sets of all odd or even numbers between 1 and 4 (i.e. U = {1, 2, 3, 4}):

M <-- ( (1 AND 3) ODD
        (2 AND 4) EVEN );
DEFINITIONS(M);

The routine DEFINITIONS would process each pair and return the machine
(1 (3 () (ODD)) () 2 (4 () (EVEN)) ()).

In order to associate the appropriate information with the appropriate final state, the
EVALUATE routine must be enhanced so that it receives an additional argument:

function: EVALUATE (EXP, FS_INFO);   /* Enhanced version */
local variables: prod, sum, symbol;
do forever;
   if EXP is empty then return OR(sum, prod);
   get_symb: symbol <-- NEXT_SYMBOL(EXP);
   if symbol = '*' or symbol = 'AND' then goto get_symb;
   if symbol = '+' or symbol = 'OR' then begin;
      sum <-- OR(sum, prod);
      prod <-- nil;
   end;
   else begin;
      if symbol is an expression
         then symbol <-- EVALUATE(symbol, FS_INFO);
         else symbol <-- triple(symbol, nil, FS_INFO);
      if prod is empty then prod <-- symbol;
      else prod <-- AND(symbol, prod);
   end;
end;
end;

The following changes must then be made to OR and AND to enable them to work with final
state markers that are sets rather than logical constants. This entails the use of a function
union() to handle the possibility that various set-expressions might specify non-disjoint sets
(i.e. single states can result which recognize logically different sets; these states must then
have multiple information lists associated with them).

function: OR (M1, M2);   /* Enhanced version */
local variables: t1, t2, machine;
do forever;
   if M1 is empty then add M2 to machine and return machine;
   if M2 is empty then add M1 to machine and return machine;
   if M1(1).trans_symb < M2(1).trans_symb
      then add next_triple(M1) to machine;
   else if M2(1).trans_symb < M1(1).trans_symb
      then add next_triple(M2) to machine;
   else begin;
      t1 <-- next_triple(M1);
      t2 <-- next_triple(M2);
      add triple(t1.trans_symb,
                 OR(t1.trans_list, t2.trans_list),
                 union(t1.final_state, t2.final_state))
         to machine;
   end;
end;
end;

function: AND (M1, M2);   /* Enhanced version */
local variables: t1, t2, machine, temp, fstate;
do forever;
   if M1 is empty or M2 is empty then return machine;
   if M1(1).trans_symb < M2(1).trans_symb then begin;
      t1 <-- next_triple(M1);
      t1.trans_list <-- AND(t1.trans_list, M2);
      if t1.final_state is true then
         t1.trans_list <-- OR(t1.trans_list, M2);
      add t1 to machine;
   end;
   else if M2(1).trans_symb < M1(1).trans_symb then begin;
      t2 <-- next_triple(M2);
      t2.trans_list <-- AND(t2.trans_list, M1);
      if t2.final_state is true then
         t2.trans_list <-- OR(t2.trans_list, M1);
      add t2 to machine;
   end;
   else begin;
      t1 <-- next_triple(M1);
      t2 <-- next_triple(M2);
      fstate <-- false;
      if t2.trans_list is empty then
         fstate <-- t1.final_state;
      if t1.trans_list is empty then
         fstate <-- union(fstate, t2.final_state);
      temp <-- AND(t1.trans_list, t2.trans_list);
      if t1.final_state is true then
         temp <-- OR(temp, M2);
      if t2.final_state is true then
         temp <-- OR(temp, M1);
      add triple(t1.trans_symb, temp, fstate) to machine;
   end;
end;
end;

The  construction  just  described  results  in  a  machine  in  which  each  final  state  provides 
precise  information  about  the  input  consumed  thus  far.  Unfortunately,  the  information  is 
probably  too  precise  for  many  applications.  For  example,  if  the  machine  is  driven  so  that 
information  is  yielded  only  from  the  state  the  machine  happens  to  be  in  after  exhausting  the 
input,  a  set  which  does  not  leave  the  machine  in  a  final  state  will  merely  be  signalled  as  ‘not 
accepted’  even  though  a  subset  of  that  set  may  have  led  the  machine  through  a  final  state. 
A  solution  to  this  problem  might  be  to  design  the  driver  to  output  information  only  from  the 
first  final  state  reached.  Such  a  design  would  guarantee  that  if  any  subsets  of  the  input  lead 
the machine through a final state, the machine will yield some information; however, any
information  about  symbols  consumed  after  that  final  state  will  be  lost.  Similarly,  having  the 
machine  output  information  from  only  the  last  final  state  reached  will  not  reveal  whether 
any  smaller  subsets  of  the  input  were  recognizable.  What  is  needed  is  a  machine  capable  of 
providing  information  about  all  subsets  of  the  input  which  are  recognizable. 

It might appear at first that by yielding output at each final state reached during the
consumption of a given input, information would be obtained about all subsets of that input
which are recognizable. However, a closer inspection of the machine will reveal the
inadequacy of this approach. Let M be a machine described by the set-expression abc + b
(U = {a, b, c}). If M is given the input ab, it will end in a non-final state even though
b represents a recognizable subset. In order to enable the machine to output information
about every recognizable subset for every possible input, a more sophisticated (and
computationally more expensive) construction algorithm is necessary.
6 The Subset Machine

A Subset Machine is defined to be a special type of SRA denoted by a 7-tuple (Q, U, θ,
δ, q0, Δ, λ) in which Δ is an output set representing units of information associated with
particular states from Q by the relation λ. For the immediate purposes Δ will be considered
to be the set of information lists associated with a set of set-expressions. However, instead
of associating an element of Δ with a state if and only if that state recognizes precisely a
set specified by that element's associated set-expression, λ will be considered to associate
a state with an element of Δ if and only if a subset of the set recognized by that state is
specified by the set-expression associated with that element. In other words, if a subset
of the input consumed in reaching a given state is recognizable, then that state should
recognize it. Thus, no matter what state the machine terminates in, the output from that
state will include information about every recognizable subset of the consumed input.

It is clear from the above definition that the first state in a machine which is capable of
recognizing a particular subset must have an information list concerning that set associated
with it; however, all subsequent states must also recognize that set and, therefore, it would
appear that they must also have that same information list associated with them. In order
to avoid this redundant storage of information, the Subset Machine will be constructed
such that only the first state which is capable of recognizing a particular set will provide
information about that set. Then, if the driver is designed so that it collects the
information available at each state during the consumption of an input, precisely the same set of
information will be generated upon termination of the machine as if the above definition were
strictly followed. Thus, the earlier driver is enhanced as follows:

function: DRIVER (S, M);   /* Subset Machine */
local variables: i, n, info, symbol;
do while S is not empty;
   symbol <-- next_symb(S);
   n <-- number of triples in M;
   do i = 1 to n;
      if M(i).trans_symb = symbol then begin;
         /* read the state's information before M is replaced */
         info <-- union(info, M(i).final_state);
         M <-- M(i).trans_list;
         leave;   /* exit loop */
      end;
   end;
end;
return info;
end;

Essentially,  all  that  has  changed  is  that  now  an  element  of  S  for  which  there  is  no  transition 
from  the  current  state  of  M  is  ignored  without  a  change  of  state  (recall  that  in  the  earlier 
machines  this  situation  implied  that  the  input  set  simply  could  not  be  recognized).  Also, 
the  result  now  represents  a  set  of  information  lists  rather  than  a  single  one. 
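
The behaviour of the enhanced driver can be illustrated with a minimal Python sketch. It is not the paper's implementation: the representation (a machine as a list of `(symbol, transitions, info)` tuples), the function name `driver`, and the hand-built machine `M` with three illustrative rule labels are all assumptions made for this example.

```python
def driver(symbols, machine):
    """Collect the information of every state reached while consuming symbols."""
    info = set()
    for symbol in symbols:
        for alpha, transitions, final in machine:
            if alpha == symbol:
                info |= final          # gather this state's information first
                machine = transitions  # then move to the next state
                break
        # no matching transition: the symbol is ignored, the state is unchanged
    return info


# Hypothetical Subset Machine, built by hand for three illustrative rules
# labelled "12OR13", "123" and "1OR3".
M = [
    (1, [(2, [(3, [], frozenset({"12OR13", "123", "1OR3"}))],
          frozenset({"12OR13"})),
         (3, [], frozenset({"12OR13", "1OR3"}))],
     frozenset({"1OR3"})),
    (3, [], frozenset({"1OR3"})),
]

print(driver([1, 2], M))
print(driver([2, 3], M))
```

In the second call the symbol 2 has no transition from the initial state, so it is skipped and only the information reached via 3 is returned.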

It is certainly not obvious from the present construction how to determine whether any
subsets of the elements consumed prior to a given state are recognizable; however, this is
precisely the information which must be computed for every state in the machine. Fortunately,
this can be accomplished in a straightforward manner by utilizing the information
already computed by the earlier construction routine. Note that every possible sequence of
transitions in a machine M represents a partition of S(M). For example, a transition list
consisting of the k transitions (a_1, a_2, ..., a_k), ordered by θ, partitions S(M) such that the
sets containing the element a_j, j ≤ k, do not contain the elements a_i where i < j. This fact
reveals immediately that no set containing an a_i can be a subset of a set containing an a_j.
Thus, to identify the recognizable subsets of a set containing an a_i, only the sets containing
an a_j where j > i need to be checked. This is precisely what the following routine does in
converting an SRA into a Subset Machine:

function: SUBSET (M);

    local variables: i, j, n;

    if M is empty then return nil;
    n <-- number of triples in M;
    do i = n to 1 by -1;
        M(i) <-- SUBSET(M(i));
        do j = 1 to (i - 1);
            M(j) <-- OR(M(i), M(j));
        end;
    end;
    return M;
end;
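
As a rough illustration of the conversion, the following Python sketch merges the later sibling transitions into each state's transition list and then recurses. It assumes a machine is a list of `(symbol, transitions, info)` tuples sorted by symbol, and it uses the natural ordering of the symbols as a stand-in for θ; the names `or_sets`, `subset` and the sample machine `SRA` are hypothetical.

```python
def or_sets(m1, m2):
    """Merge two machines (lists of triples sorted by symbol)."""
    merged, i, j = [], 0, 0
    while i < len(m1) and j < len(m2):
        (a1, t1, f1), (a2, t2, f2) = m1[i], m2[j]
        if a1 < a2:
            merged.append(m1[i])
            i += 1
        elif a2 < a1:
            merged.append(m2[j])
            j += 1
        else:
            # same symbol: merge the sub-machines and the information lists
            merged.append((a1, or_sets(t1, t2), f1 | f2))
            i += 1
            j += 1
    return merged + m1[i:] + m2[j:]


def subset(machine):
    """Convert an SRA into a Subset Machine (recursive sketch)."""
    if not machine:
        return []
    (alpha, transitions, final), rest = machine[0], machine[1:]
    # Only transitions on later symbols can lead to subsets of the sets
    # below alpha, so they are merged in before recursing.
    return [(alpha, subset(or_sets(transitions, rest)), final)] + subset(rest)


# Hypothetical SRA for three rules labelled "12OR13", "123" and "1OR3".
SRA = [
    (1, [(2, [(3, [], frozenset({"123"}))], frozenset({"12OR13"})),
         (3, [], frozenset({"12OR13"}))],
     frozenset({"1OR3"})),
    (3, [], frozenset({"1OR3"})),
]

machine = subset(SRA)
```

After the conversion, the state reached via 1, 2, 3 carries the information of all three rules, since each of them describes a subset of {1, 2, 3}.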

An examination of the above code reveals that the conversion of an SRA to a Subset Machine
is a relatively expensive process. Specifically, if it is assumed for a machine M that the
elements of U are uniformly distributed throughout S(M) (if the distribution of elements
is not uniform, the size of the machine can be reduced by associating smaller θ-values with
elements which appear more often and larger values with elements that appear less often)
and that the average size of the sets is roughly |U|/2, it is easy to see that a transition
a, θ(a) = k, will lead to a sub-machine of M roughly twice as large (in number of states)
as a sub-machine reachable via a transition b, θ(b) = k + 1. In other words, an arbitrary
transition a in a machine M having n states will lead to a sub-machine of approximately
n/2^θ(a) states. Thus, an SRA recognizing z sets, containing an average of n/2 randomly selected
elements, would require O(z log z) computations for conversion to a Subset Machine.

7  SRA’s  and  Expert  Systems


Historically, expert systems grew out of attempts to create general problem solving programs
using predicate logic. The idea behind expert systems is that pure logic cannot
be  expected  to  solve  real-world  problems  without  real-world  knowledge.  Thus,  an  expert 
system  has  traditionally  employed  both  an  inference  engine  and  a  knowledge  base.  The 
knowledge  base  usually  consists  of  a  collection  of  IF-THEN  rules  composed  of  logical  AND 
and  OR  conditions  which  are  evaluated  by  the  inference  engine.  Although  superficially 
similar  to  these  IF-THEN  rules,  set-expressions  are  significantly  more  restrictive  in  that 
they  provide  no  facility  for  variable  instantiation.  This  restriction  (among  others)  results  in 
an  extraordinary  reduction  in  computational  requirements  while  maintaining  a  surprising 
degree  of  expressive  power.  In  effect,  the  SRA  construction  process  can  be  viewed  as  a 
compilation  of  IF-THEN  rules  with  constant  operands.  The  result  of  this  compilation  not 
only  processes  much  faster  than  its  interpreted  counterpart  (i.e.  inference  engine  and  rules), 
but  is  also  of  a  more  convenient  form  for  analyzing  the  logical  structure  of  the  knowledge 
base. 

Consider a Subset Machine resulting from the construction process. Means immediately
exist by which “holes” in the knowledge base can be isolated and ambiguous or contradictory
sets of rules pinpointed (where a rule is now considered to be a set-expression). For
example, any state in the machine which recognizes two or more sets implies that the rules
describing those sets fail to distinguish them for any input containing a subset equal to the
set of elements consumed in reaching that state. Thus, it is a simple matter for a routine
(even the construction routine) to find these ambiguously-defined subsets and permit the
knowledge engineer to qualify them with additional rules. In traditional expert systems
these conditions are exceedingly difficult, if not impossible, to discover by means other than
trial-and-error testing at the front-end of the system.
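
A routine of the kind described above might look like the following Python sketch. It is only an illustration under assumed conventions: a machine is a list of `(symbol, transitions, info)` tuples, and the name `ambiguous_states` and the sample machine are hypothetical.

```python
def ambiguous_states(machine, path=()):
    """Yield (path, info) for every state whose info names two or more sets."""
    for alpha, transitions, final in machine:
        here = path + (alpha,)
        if len(final) >= 2:
            yield here, final
        # keep walking: deeper states may be ambiguous as well
        yield from ambiguous_states(transitions, here)


# Hypothetical machine for rules labelled "12OR13", "123" and "1OR3".
M = [
    (1, [(2, [(3, [], frozenset({"12OR13", "123", "1OR3"}))],
          frozenset({"12OR13"})),
         (3, [], frozenset({"12OR13", "1OR3"}))],
     frozenset({"1OR3"})),
    (3, [], frozenset({"1OR3"})),
]

for path, labels in ambiguous_states(M):
    print(path, sorted(labels))
```

Each reported path is the sequence of elements consumed in reaching the state, which is what a knowledge engineer would need in order to write a qualifying rule.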

Another type of useful analysis which can be implemented is gap analysis. Gap analysis
consists of locating relatively long paths in the machine which do not trigger the recognition
of any sets. These paths are important because they may represent holes in the knowledge
base. These holes are particularly difficult to locate in traditional expert systems because if
areas of the domain are overlooked in the development stage of the knowledge base, they
stand a great likelihood of being overlooked in the testing stage. Thus, gap analysis provides
a means for helping to assure that the knowledge base spans its given problem domain.
(Gap analysis could also be used as a primary tool for creating the knowledge base. Once
the domain of expertise has been established, the knowledge engineer could successively
address the largest gap in the knowledge base until the domain is fully spanned. This
would provide an organized and efficient alternative to starting from scratch and “filling in
the pieces.”)
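
One possible reading of gap analysis is sketched below in Python. It assumes that a "gap" is a run of consecutive transitions whose states carry no information, and that a machine is a list of `(symbol, transitions, info)` tuples; the function name `longest_gap` and the sample machine are hypothetical.

```python
def longest_gap(machine, run=()):
    """Return the longest run of consecutive transitions recognizing nothing."""
    best = run
    for alpha, transitions, final in machine:
        # a state that recognizes something ends the current gap
        current = () if final else run + (alpha,)
        candidate = longest_gap(transitions, current)
        if len(candidate) > len(best):
            best = candidate
    return best


# Hypothetical machine: nothing is recognized until 1, 2, 3, 4 are all consumed.
M = [
    (1, [(2, [(3, [(4, [], {"R1"})], set())], set())], set()),
    (5, [], {"R2"}),
]

print(longest_gap(M))
```

Repeatedly addressing the run returned by such a routine corresponds to the strategy of successively closing the largest gap in the knowledge base.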

8  Summary 

This paper began with a combinatorial argument suggesting that traditional automata
theory is impractical for applications involving set recognition. A new type of automaton (a
set-recognizing automaton or SRA) was then developed, along with a notation (set-expressions) for
describing the sets accepted by such a machine. Subsequently, an algorithm for converting a
set-expression to its corresponding SRA was developed. This algorithm was then enhanced
to construct more powerful machines which could have applications in artificial intelligence.
Specifically, the use of the Subset Machine was considered for the implementation and
development of a restricted class of expert systems. These very preliminary results suggest
that SRA’s may have a broad range of potential applications worthy of further study.

LISP  Implementation: 

Here a simple LISP implementation of the algorithms described in constructing a Subset
Machine is presented. The goal is to provide machine-executable code which resembles as
much as possible the algorithmic descriptions used in this paper. As a result, the LISP
definitions display a heavy Algol accent which is intended to make them more easily
understandable to the reader who is unfamiliar with LISP.


The  following  routines  are  defined  in  order  to  provide  convenient  access  to  the  various  pieces 
of  the  machine: 

(DEFUN ALPHA (L) (CAR L))

This function returns the first element (assumed to be an atom from U) of a triple.

(DEFUN  TRANSL  (L)  (CAR  (CDR  L))) 

This  function  returns  the  second  element  (assumed  to  be  a  list  of  triples)  of  a  triple. 

(DEFUN  FINAL  (L)  (CAR  (CDR  (CDR  L)))) 

This  function  returns  the  third  element  (assumed  to  be  a  list  representing  information  about 
recognized  sets)  of  the  triple. 

(DEFUN TRIPLE (L) (LIST (ALPHA L) (TRANSL L) (FINAL L)))

This function returns a list whose elements constitute a triple.

(DEFUN  REST-OF  (L)  (CDR  (CDR  (CDR  L)))) 

This  function  takes  a  list  L  of  triples  and  returns  all  but  the  first  of  those  triples. 

The following pages contain LISP definitions for OR-SETS, AND-SETS, EVALUATE,
DEFINITIONS, SUBSET, and DRIVER. (A comparison function LT is assumed to exist
which defines < according to θ.) A simple example is then provided.

(DEFUN OR-SETS (M1 M2)
  (PROG (MACHINE)
   LOOP1 (COND ((NULL M1) (RETURN (APPEND MACHINE M2))))
         (COND ((NULL M2) (RETURN (APPEND MACHINE M1))))
         (COND ((LT (ALPHA M1) (ALPHA M2))
                (SETQ MACHINE (APPEND MACHINE (TRIPLE M1)))
                (SETQ M1 (REST-OF M1)))
               ((LT (ALPHA M2) (ALPHA M1))
                (SETQ MACHINE (APPEND MACHINE (TRIPLE M2)))
                (SETQ M2 (REST-OF M2)))
               (T (SETQ MACHINE
                        (APPEND MACHINE
                                (LIST (ALPHA M1)
                                      (OR-SETS (TRANSL M1)
                                               (TRANSL M2))
                                      (UNION (FINAL M1)
                                             (FINAL M2)))))
                  (SETQ M1 (REST-OF M1))
                  (SETQ M2 (REST-OF M2))))
         (GO LOOP1)))


(DEFUN AND-SETS (M1 M2)
  (PROG (MACHINE T1 T2 T1T2)
   LOOP1 (COND ((OR (NULL M1) (NULL M2)) (RETURN MACHINE)))
         (COND ((LT (ALPHA M1) (ALPHA M2))
                (SETQ T1 (LIST (ALPHA M1)
                               (AND-SETS (TRANSL M1) M2)
                               NIL))
                (COND ((FINAL M1)
                       (SETQ T1 (LIST (ALPHA T1)
                                      (OR-SETS (TRANSL T1) M2)
                                      NIL))))
                (SETQ M1 (REST-OF M1))
                (SETQ MACHINE (APPEND MACHINE T1)))
               ((LT (ALPHA M2) (ALPHA M1))
                (SETQ T2 (LIST (ALPHA M2)
                               (AND-SETS (TRANSL M2) M1)
                               NIL))
                (COND ((FINAL M2)
                       (SETQ T2 (LIST (ALPHA T2)
                                      (OR-SETS (TRANSL T2) M1)
                                      NIL))))
                (SETQ M2 (REST-OF M2))
                (SETQ MACHINE (APPEND MACHINE T2)))
               (T (SETQ T1 (AND-SETS (TRANSL M1)
                                     (TRANSL M2)))
                  (COND ((FINAL M1)
                         (SETQ T1 (OR-SETS T1 (REST-OF M2)))))
                  (COND ((FINAL M2)
                         (SETQ T1 (OR-SETS T1 (REST-OF M1)))))
                  (SETQ MACHINE
                        (APPEND MACHINE
                                (LIST (ALPHA M1)
                                      T1
                                      (UNION (COND ((TRANSL M1) NIL)
                                                   (T (FINAL M1)))
                                             (COND ((TRANSL M2) NIL)
                                                   (T (FINAL M2)))))))
                  (SETQ M1 (REST-OF M1))
                  (SETQ M2 (REST-OF M2))))
         (GO LOOP1)))

(DEFUN EVALUATE (EXP)
  (PROG (SYMBOL PROD SUM)
   LOOP (COND ((NULL EXP) (RETURN (OR-SETS SUM PROD))))
        (SETQ SYMBOL (CAR EXP))
        (SETQ EXP (CDR EXP))
        (COND ((OR (EQUAL SYMBOL 'AND) (EQUAL SYMBOL '*))
               (GO LOOP))
              ((OR (EQUAL SYMBOL 'OR) (EQUAL SYMBOL '+))
               (SETQ SUM (OR-SETS SUM PROD))
               (SETQ PROD NIL)
               (GO LOOP))
              ((LISTP SYMBOL) (SETQ SYMBOL (EVALUATE SYMBOL)))
              (T (SETQ SYMBOL (LIST SYMBOL NIL FS-INFO))))
        (COND ((NULL PROD) (SETQ PROD SYMBOL))
              (T (SETQ PROD (AND-SETS PROD SYMBOL))))
        (GO LOOP)))


(DEFUN DEFINITIONS (EXPRESSIONS)
  (PROG (MACHINE EXP)
   LOOP (COND ((NULL EXPRESSIONS) (RETURN (SUBSET MACHINE))))
        (SETQ EXP (CAR EXPRESSIONS))
        (SETQ EXPRESSIONS (CDR EXPRESSIONS))
        (SETQ FS-INFO (CAR EXPRESSIONS))
        (SETQ EXPRESSIONS (CDR EXPRESSIONS))
        (SETQ MACHINE (OR-SETS MACHINE (EVALUATE EXP)))
        (GO LOOP)))


(DEFUN SUBSET (M)
  (COND ((NULL M) NIL)
        (T (APPEND (LIST (ALPHA M)
                         (SUBSET (OR-SETS (TRANSL M)
                                          (REST-OF M)))
                         (FINAL M))
                   (SUBSET (REST-OF M))))))

(DEFUN DRIVER (S M)
  (PROG (INFO TEMP SYMBOL)
   LOOP  (COND ((NULL S) (RETURN INFO)))
         (SETQ SYMBOL (CAR S))
         (SETQ S (CDR S))
         (SETQ TEMP M)
   LOOP2 (COND ((NULL TEMP) (GO LOOP))
               ((EQUAL SYMBOL (ALPHA TEMP))
                (SETQ M (TRANSL TEMP))
                (SETQ INFO (UNION INFO (FINAL TEMP)))
                (GO LOOP))
               (T (SETQ TEMP (REST-OF TEMP))))
         (GO LOOP2)))

An Example

The following defines a machine via three set-expressions:

(SETQ EXPRESSION '( (1 2 + 1 3) (12OR13)
                    (1 2 3)     (123)
                    (1 + 3)     (1OR3) ) )

EXPRESSION is then converted to a machine M:

(SETQ M (DEFINITIONS EXPRESSION))

The DRIVER can then be used to examine some set (represented by a list that is assumed
to have been ordered by θ). For example:

(DRIVER '(1 2) M)   ==> (12OR13 1OR3)
(DRIVER '(1 2 3) M) ==> (12OR13 123 1OR3)
(DRIVER '(2 3) M)   ==> (1OR3)

Additional  Operators: 

For completeness the complements of the AND and OR operations should be considered.
These operations have not been discussed thus far because, although quite powerful in terms
of descriptive conciseness, their indiscriminate use may result in excessive computations.
Furthermore, determining what sets are described by expressions involving these operations
tends to be far less intuitive than for expressions using only AND's and OR's; therefore,
their use may be prone to error (or huge computational expense) if extreme care is not
taken. Recall their definitions:

4. If E_1 and E_2 are set-expressions which denote sets S_1 and S_2, respectively, then (E_1 - E_2)
denotes the set S_1 - S_2 (where "-" in this case represents ordinary set difference).

5. If E_1 and E_2 are set-expressions which denote sets S_1 and S_2, respectively, then (E_1 / E_2)
denotes the set {t_1 - (t_1 ∩ t_2) | t_1 ∈ S_1 and t_2 ∈ S_2}.

Thus, the set-expression a - b denotes the set of sets in a which are not in b. For example,
the expression U^n - U^m denotes the set of all sets of size l, m < l ≤ n, over U. The
expression a/b, on the other hand, denotes the sets in a which do not have a subset in b.
For example, the expression U^n/ab, |U| = n, denotes the set of all sets over U which do
not contain the elements a and b. The algorithms for implementing these operations are as
follows:


function: WIDTH_DIFFERENCE (M1, M2);

    local variables: machine, t1, t2, temp, fstate;

    do forever;
        if M1 is empty then return machine;
        if M2 is empty then add M1 to machine and return machine;
        if M1.trans_symb < M2.trans_symb then
            add next_triple(M1) to machine;
        else if M2.trans_symb < M1.trans_symb then
            temp <-- next_triple(M2);
        else begin;
            t1 <-- next_triple(M1);
            t2 <-- next_triple(M2);
            if t2.final_state is not empty then fstate <-- nil;
            else fstate <-- t1.final_state;
            temp <-- triple(t1.trans_symb,
                            WIDTH_DIFFERENCE(t1.trans_list,
                                             t2.trans_list),
                            fstate);
            if temp.trans_list is not empty or temp.final_state
                is not empty then add temp to machine;
        end;
    end;
end;


function: DEPTH_DIFFERENCE (M1, M2);

    local variables: machine, t1, t2, temp, fstate;

    do forever;
        if M1 is empty then return machine;
        if M2 is empty then add M1 to machine and return machine;
        if M2.trans_symb < M1.trans_symb then
            temp <-- next_triple(M2);
        else begin;
            t1 <-- next_triple(M1);
            if t1.trans_symb < M2.trans_symb then
                temp <-- triple(t1.trans_symb,
                                DEPTH_DIFFERENCE(t1.trans_list, M2),
                                t1.final_state);
            else begin;
                t2 <-- next_triple(M2);
                if t2.final_state is empty then
                    temp <-- triple(t1.trans_symb,
                                    DEPTH_DIFFERENCE(t1.trans_list,
                                                     t2.trans_list),
                                    t1.final_state);
                else temp <-- triple(nil, nil, nil);
            end;
            if temp.trans_list is not empty or temp.final_state
                is not empty then add temp to machine;
        end;
    end;
end;
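
The ordinary set difference (operation 4) can also be sketched in Python. This is an illustration rather than the paper's code: it assumes a machine is a list of `(symbol, transitions, info)` tuples sorted by symbol, uses the natural ordering of the symbols in place of θ, and the name `width_difference` and the sample machines are hypothetical.

```python
def width_difference(m1, m2):
    """Keep the sets recognized by m1 that are not recognized by m2."""
    out, i, j = [], 0, 0
    while i < len(m1):
        if j == len(m2):
            return out + m1[i:]          # nothing left to subtract
        (a1, t1, f1), (a2, t2, f2) = m1[i], m2[j]
        if a1 < a2:
            out.append(m1[i])            # no counterpart in m2: keep as is
            i += 1
        elif a2 < a1:
            j += 1                       # m2 triple cannot match anything in m1
        else:
            # same prefix: drop the info if m2 also recognizes this exact set
            final = frozenset() if f2 else f1
            transitions = width_difference(t1, t2)
            if transitions or final:
                out.append((a1, transitions, final))
            i += 1
            j += 1
    return out


X = frozenset({"x"})
Y = frozenset({"y"})
m1 = [(1, [(2, [], X)], X), (3, [], X)]      # recognizes {1}, {1,2}, {3}
m2 = [(1, [(2, [], Y)], frozenset())]        # recognizes {1,2}
print(width_difference(m1, m2))
```

Branches that end up with neither transitions nor information are pruned, mirroring the final test on temp in the pseudocode.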

These  operations  may  be  translated  into  LISP  as  follows: 

(DEFUN WIDTH-DIFFERENCE (M1 M2)
  (PROG (MACHINE TEMP)
   LOOP (COND ((NULL M1) (RETURN MACHINE))
              ((NULL M2) (RETURN (APPEND MACHINE M1)))
              ((LT (ALPHA M2) (ALPHA M1))
               (SETQ M2 (REST-OF M2)))
              ((LT (ALPHA M1) (ALPHA M2))
               ; keep a triple of M1 that has no counterpart in M2
               (SETQ MACHINE (APPEND MACHINE (TRIPLE M1)))
               (SETQ M1 (REST-OF M1)))
              (T (SETQ TEMP
                       (LIST (ALPHA M1)
                             (WIDTH-DIFFERENCE (TRANSL M1)
                                               (TRANSL M2))
                             (COND ((FINAL M2) NIL)
                                   (T (FINAL M1)))))
                 (COND ((OR (TRANSL TEMP) (FINAL TEMP))
                        (SETQ MACHINE (APPEND MACHINE TEMP))))
                 (SETQ M1 (REST-OF M1))
                 (SETQ M2 (REST-OF M2))))
        (GO LOOP)))


(DEFUN DEPTH-DIFFERENCE (M1 M2)
  (PROG (MACHINE TEMP)
   LOOP1 (COND ((NULL M1) (RETURN MACHINE))
               ((NULL M2) (RETURN (APPEND MACHINE M1)))
               ((LT (ALPHA M2) (ALPHA M1))
                (SETQ M2 (REST-OF M2))
                (GO LOOP1))
               ((LT (ALPHA M1) (ALPHA M2))
                (SETQ TEMP
                      (LIST (ALPHA M1)
                            (DEPTH-DIFFERENCE (TRANSL M1) M2)
                            (FINAL M1))))
               (T (SETQ TEMP
                        (COND ((FINAL M2)
                               (LIST NIL NIL NIL))
                              (T (LIST (ALPHA M1)
                                       (DEPTH-DIFFERENCE
                                        (TRANSL M1) (TRANSL M2))
                                       (FINAL M1)))))
                  (SETQ M2 (REST-OF M2))))
         (SETQ M1 (REST-OF M1))
         (COND ((OR (TRANSL TEMP) (FINAL TEMP))
                (SETQ MACHINE (APPEND MACHINE TEMP))))
         (GO LOOP1)))